Pronunciation variation speech recognition without dictionary modification on sparse database
Abstract
Generally, a speech recognition system uses a fixed set of pronunciations from its dictionary for training and decoding. However, even a well-defined lexicon cannot cover all variation in human pronunciation, and a dictionary that listed every possible pronunciation would be too large to implement. Sharing Gaussian densities across phonetic models and using decision trees for pronunciation variation have proved efficient for handling pronunciation variation without dictionary modification. This paper presents alternative methods that can be used even when the database is sparse. Re-label training is modified to apply rule-based pronunciation variation in order to obtain real phonetic acoustic models. Phonemic acoustic models are then retrained by tying HMM states across the phonetic models. These new phonemic models allow alternative search paths during recognition. In the experiment, the system shows better performance.
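As a rough illustration of the rule-based re-labeling idea described in the abstract, the sketch below expands a canonical phoneme sequence into candidate phonetic sequences and keeps the best-scoring one. It is a minimal sketch under stated assumptions, not the paper's method: the rule table `RULES`, the function names, and the `score` callback (a stand-in for an acoustic alignment likelihood) are all hypothetical.

```python
from itertools import product

# Hypothetical rewrite rules: canonical phoneme -> allowed surface phones.
# Illustrative only; these rules do not come from the paper.
RULES = {
    "t": ["t", "dx"],    # e.g. a flapped variant
    "ih": ["ih", "ax"],  # e.g. a reduced vowel
}


def expand_variants(phonemes):
    """Return every phonetic sequence the rules allow for a phoneme sequence."""
    # Phonemes without a rule keep their canonical form as the only option.
    options = [RULES.get(p, [p]) for p in phonemes]
    return [list(seq) for seq in product(*options)]


def relabel(phonemes, score):
    """Pick the variant with the highest score.

    `score` is an assumed callback standing in for the acoustic likelihood
    that a forced alignment would assign to each candidate sequence.
    """
    return max(expand_variants(phonemes), key=score)
```

Because the variants are generated from rules at training time, the dictionary itself stays unchanged; only the labels used to train the phonetic models differ.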
Similar resources
Pronunciation Variation Speech Recognition without New Dictionary Construction
Generally, a speech recognition system uses a fixed set of pronunciations from its dictionary for training and decoding. However, even a well-defined dictionary cannot cover all variation in human pronunciation, and a dictionary that listed every possible pronunciation would be too large to implement. This paper presents efficient strategies for ...
Automatic segmentation and clustering of speech using sparse coding
We investigate the application of sparse coding and dictionary learning to the discovery of sub-word units in speech. The ultimate goal is to generate pronunciation dictionaries that could be used for automatic speech recognition (ASR). A dictionary of sparse coding atoms is trained to code a subset of the TIMIT corpus. Some of the trained units exhibit strong correlation with specific referenc...
Modeling Pronunciation Variation for Cantonese Speech Recognition
Due to the large variability of pronunciation in spontaneous speech, pronunciation modeling becomes a more challenging and essential part of speech recognition. In this paper, we describe two different approaches to pronunciation modeling using decision trees. At the lexical level, a pronunciation variation dictionary is built to obtain alternative pronunciations for each word, in which each entr...
Speech Enhancement using Adaptive Data-Based Dictionary Learning
In this paper, a speech enhancement method based on sparse representation of data frames has been presented. Speech enhancement is one of the most applicable areas in different signal processing fields. The objective of a speech enhancement system is improvement of either intelligibility or quality of the speech signals. This process is carried out using the speech signal processing techniques ...
Improving pronunciation modeling for non-native speech recognition
In this paper, three different approaches to pronunciation modeling are investigated. Two existing pronunciation modeling approaches, namely the pronunciation dictionary and the n-best rescoring approach, are modified to work with a small amount of non-native speech. We also propose a speaker clustering approach, which is capable of grouping the speakers based on their pronunciation habits. Given some s...